Overview

Dataset statistics

Number of variables: 15
Number of observations: 98619
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.3 MiB
Average record size in memory: 120.0 B
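
The two memory figures agree with each other under one simple assumption: each of the 15 columns stores a single 8-byte value per row (a shallow size that ignores string contents). That assumption is not stated in the report; the sketch below only shows that it reproduces the reported numbers.

```python
# Hypothetical consistency check. BYTES_PER_CELL is an assumption,
# not a value taken from the report.
N_ROWS = 98_619
N_COLS = 15
BYTES_PER_CELL = 8  # assumed shallow size of one cell

record_bytes = N_COLS * BYTES_PER_CELL   # bytes per row
total_bytes = N_ROWS * record_bytes      # bytes for the whole table
total_mib = total_bytes / 1024 ** 2      # convert bytes to MiB

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
```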

Variable types

Text: 9
Categorical: 4
Numeric: 1
DateTime: 1

Alerts

Airport Continent is highly overall correlated with Continents (High correlation)
Continents is highly overall correlated with Airport Continent (High correlation)
Passenger ID has unique values (Unique)

Reproduction

Analysis started: 2024-05-14 13:51:12.646188
Analysis finished: 2024-05-14 13:51:25.589531
Duration: 12.94 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json

Variables

Passenger ID
Text

UNIQUE 

Distinct: 98619
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 591714
Distinct characters: 62
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
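
As a rough illustration of such character properties, Python's standard `unicodedata` module exposes the general category and the official name of a code point. This is only a sketch: it is not how the report fills its category, script, and block columns (the standard library does not expose scripts or blocks), and `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

# "CdUz2g" is the 3rd sample row of Passenger ID in this report.
counts = category_counts("CdUz2g")
print(counts)  # Lu = uppercase letter, Ll = lowercase letter, Nd = decimal digit

# U+00E8 appears in the airport name "Grenoble-Isère Airport".
print(unicodedata.name("\u00e8"))
```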

Unique

Unique: 98619
Unique (%): 100.0%

Sample

1st row: ABVWIg
2nd row: jkXXAX
3rd row: CdUz2g
4th row: BRS38V
5th row: 9kvTLo

Value | Count | Frequency (%)
mzwjgo | 2 | < 0.1%
pqvgsy | 2 | < 0.1%
abvwig | 1 | < 0.1%
dpafly | 1 | < 0.1%
cduz2g | 1 | < 0.1%
brs38v | 1 | < 0.1%
9kvtlo | 1 | < 0.1%
nmjkvh | 1 | < 0.1%
8ipfpe | 1 | < 0.1%
pqixby | 1 | < 0.1%
Other values (98607) | 98607 | > 99.9%
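
The word table above lists lower-cased values (`abvwig` for the sample row `ABVWIg`) and shows two tokens with a count of 2, even though Passenger ID is flagged as unique. One plausible reading, sketched below, is that values are lower-cased and split on whitespace before counting, so IDs that differ only in letter case collapse into one token. The tokenization rule and the colliding pair of IDs are assumptions made for illustration; neither is taken from the report.

```python
from collections import Counter

def word_counts(values):
    """Count lower-cased, whitespace-separated tokens across all values."""
    tokens = (tok for value in values for tok in value.lower().split())
    return Counter(tokens)

# "ABVWIg" is a real sample row; "MzWjGo" and "mZwJgO" are hypothetical.
ids = ["ABVWIg", "MzWjGo", "mZwJgO"]
assert len(set(ids)) == len(ids)  # every raw ID is distinct

counts = word_counts(ids)
print(counts.most_common())
```

Under the same rule, multi-word columns such as Country Name would list tokens like `united` and `states` separately, which is what their word tables show.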

Most occurring characters

Value | Count | Frequency (%)
h | 9794 | 1.7%
p | 9762 | 1.6%
X | 9700 | 1.6%
2 | 9686 | 1.6%
3 | 9676 | 1.6%
L | 9670 | 1.6%
z | 9664 | 1.6%
U | 9662 | 1.6%
Z | 9655 | 1.6%
6 | 9655 | 1.6%
Other values (52) | 494790 | 83.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 591714 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 591714 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 591714 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

First Name
Text

Distinct: 8437
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 15
Median length: 13
Mean length: 5.9723278
Min length: 2

Characters and Unicode

Total characters: 588985
Distinct characters: 56
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: Edithe
2nd row: Elwood
3rd row: Darby
4th row: Dominica
5th row: Bay

Value | Count | Frequency (%)
gale | 37 | < 0.1%
brett | 36 | < 0.1%
conny | 35 | < 0.1%
gerrie | 35 | < 0.1%
haleigh | 35 | < 0.1%
shea | 34 | < 0.1%
torey | 34 | < 0.1%
gabriel | 34 | < 0.1%
dion | 33 | < 0.1%
connie | 32 | < 0.1%
Other values (8425) | 98325 | 99.7%

Most occurring characters

Value | Count | Frequency (%)
e | 68946 | 11.7%
a | 63782 | 10.8%
i | 49976 | 8.5%
n | 46326 | 7.9%
r | 43892 | 7.5%
l | 39702 | 6.7%
o | 28818 | 4.9%
t | 22371 | 3.8%
y | 17620 | 3.0%
s | 17589 | 3.0%
Other values (46) | 189963 | 32.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 588985 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 588985 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 588985 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Last Name
Text

Distinct: 41658
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 25
Median length: 21
Mean length: 7.0114988
Min length: 2

Characters and Unicode

Total characters: 691467
Distinct characters: 65
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12784
Unique (%): 13.0%

Sample

1st row: Leggis
2nd row: Catt
3rd row: Felgate
4th row: Pyle
5th row: Pencost

Value | Count | Frequency (%)
de | 446 | 0.4%
le | 161 | 0.2%
van | 144 | 0.1%
o | 90 | 0.1%
di | 85 | 0.1%
mc | 50 | 0.1%
der | 46 | < 0.1%
la | 46 | < 0.1%
von | 34 | < 0.1%
st | 34 | < 0.1%
Other values (41381) | 98794 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
e | 74984 | 10.8%
a | 53157 | 7.7%
r | 49009 | 7.1%
n | 48901 | 7.1%
l | 44712 | 6.5%
o | 44202 | 6.4%
i | 42241 | 6.1%
t | 33096 | 4.8%
s | 29181 | 4.2%
d | 20013 | 2.9%
Other values (55) | 251971 | 36.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 691467 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 691467 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 691467 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB
Male: 49598
Female: 49021

Length

Max length: 6
Median length: 4
Mean length: 4.9941492
Min length: 4

Characters and Unicode

Total characters: 492518
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Male
3rd row: Male
4th row: Female
5th row: Male

Common Values

Value | Count | Frequency (%)
Male | 49598 | 50.3%
Female | 49021 | 49.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
male | 49598 | 50.3%
female | 49021 | 49.7%

Most occurring characters

Value | Count | Frequency (%)
e | 147640 | 30.0%
a | 98619 | 20.0%
l | 98619 | 20.0%
M | 49598 | 10.1%
F | 49021 | 10.0%
m | 49021 | 10.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 492518 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 492518 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 492518 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Age
Real number (ℝ)

Distinct: 90
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.504021
Minimum: 1
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 770.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 23
median: 46
Q3: 68
95-th percentile: 86
Maximum: 90
Range: 89
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 25.929849
Coefficient of variation (CV): 0.56983643
Kurtosis: -1.1962683
Mean: 45.504021
Median Absolute Deviation (MAD): 22
Skewness: -0.00040861833
Sum: 4487561
Variance: 672.35705
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
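
Several of the Age figures above are derived from others, which allows a quick consistency check. The sketch below recomputes the range, IQR, mean, variance, and coefficient of variation from the reported extremes, quartiles, standard deviation, sum, and row count. It uses the textbook definitions, which the report is assumed (not confirmed) to follow.

```python
# Inputs are copied from the report; the formulas are standard
# definitions and are an assumption about how the report derives them.
n_rows = 98_619
total = 4_487_561
minimum, q1, q3, maximum = 1, 23, 68, 90
std_dev = 25.929849

mean = total / n_rows
derived = {
    "range": maximum - minimum,   # reported: 89
    "iqr": q3 - q1,               # reported: 45
    "mean": mean,                 # reported: 45.504021
    "variance": std_dev ** 2,     # reported: 672.35705
    "cv": std_dev / mean,         # reported: 0.56983643
}

for name, value in derived.items():
    print(f"{name:>8}: {value:.8g}")
```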

Common Values

Value | Count | Frequency (%)
29 | 1170 | 1.2%
27 | 1164 | 1.2%
39 | 1155 | 1.2%
46 | 1151 | 1.2%
6 | 1148 | 1.2%
58 | 1147 | 1.2%
67 | 1145 | 1.2%
48 | 1137 | 1.2%
72 | 1136 | 1.2%
24 | 1136 | 1.2%
Other values (80) | 87130 | 88.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1075 | 1.1%
2 | 1058 | 1.1%
3 | 1095 | 1.1%
4 | 1110 | 1.1%
5 | 1082 | 1.1%
6 | 1148 | 1.2%
7 | 1055 | 1.1%
8 | 1119 | 1.1%
9 | 1057 | 1.1%
10 | 1097 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
90 | 1076 | 1.1%
89 | 1131 | 1.1%
88 | 1069 | 1.1%
87 | 1064 | 1.1%
86 | 1068 | 1.1%
85 | 1096 | 1.1%
84 | 1128 | 1.1%
83 | 1110 | 1.1%
82 | 1073 | 1.1%
81 | 1103 | 1.1%

Nationality
Text

Distinct: 240
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 44
Median length: 32
Mean length: 7.4821992
Min length: 4

Characters and Unicode

Total characters: 737887
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: Japan
2nd row: Nicaragua
3rd row: Russia
4th row: China
5th row: China

Value | Count | Frequency (%)
china | 18317 | 17.0%
indonesia | 10559 | 9.8%
russia | 5693 | 5.3%
philippines | 5239 | 4.9%
brazil | 3791 | 3.5%
portugal | 3299 | 3.1%
poland | 3245 | 3.0%
france | 2907 | 2.7%
sweden | 2397 | 2.2%
united | 2317 | 2.2%
Other values (274) | 49946 | 46.4%

Most occurring characters

Value | Count | Frequency (%)
a | 101592 | 13.8%
i | 84945 | 11.5%
n | 82105 | 11.1%
e | 55508 | 7.5%
s | 36294 | 4.9%
o | 33260 | 4.5%
h | 30481 | 4.1%
r | 29881 | 4.0%
l | 28536 | 3.9%
d | 26232 | 3.6%
Other values (45) | 229053 | 31.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 737887 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 737887 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 737887 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Airport Name
Text

Distinct: 9062
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 65
Median length: 54
Mean length: 21.370162
Min length: 3

Characters and Unicode

Total characters: 2107504
Distinct characters: 149
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: Coldfoot Airport
2nd row: Kugluktuk Airport
3rd row: Grenoble-Isère Airport
4th row: Ottawa / Gatineau Airport
5th row: Gillespie Field

Value | Count | Frequency (%)
airport | 89110 | 31.7%
international | 9834 | 3.5%
municipal | 3762 | 1.3%
base | 3631 | 1.3%
regional | 3193 | 1.1%
field | 2962 | 1.1%
air | 2814 | 1.0%
county | 2068 | 0.7%
seaplane | 1546 | 0.5%
island | 1443 | 0.5%
Other values (10263) | 160890 | 57.2%

Most occurring characters

Value | Count | Frequency (%)
r | 260435 | 12.4%
i | 183803 | 8.7%
(space) | 182654 | 8.7%
o | 177937 | 8.4%
a | 161950 | 7.7%
t | 154728 | 7.3%
n | 113169 | 5.4%
p | 105801 | 5.0%
A | 104965 | 5.0%
e | 103540 | 4.9%
Other values (139) | 558522 | 26.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2107504 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2107504 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2107504 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Airport Country Code
Text

Distinct: 235
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 197238
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: US
2nd row: CA
3rd row: FR
4th row: CA
5th row: US

Value | Count | Frequency (%)
us | 22104 | 22.4%
au | 6370 | 6.5%
ca | 5424 | 5.5%
br | 4504 | 4.6%
pg | 4081 | 4.1%
cn | 2779 | 2.8%
id | 2358 | 2.4%
ru | 2247 | 2.3%
co | 1643 | 1.7%
in | 1486 | 1.5%
Other values (225) | 45623 | 46.3%

Most occurring characters

Value | Count | Frequency (%)
U | 32400 | 16.4%
S | 26623 | 13.5%
A | 17237 | 8.7%
C | 13964 | 7.1%
R | 13120 | 6.7%
G | 9615 | 4.9%
P | 9443 | 4.8%
B | 8513 | 4.3%
N | 8088 | 4.1%
I | 7359 | 3.7%
Other values (16) | 50876 | 25.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 197238 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 197238 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 197238 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Country Name
Text

Distinct: 235
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 44
Median length: 37
Mean length: 10.439702
Min length: 4

Characters and Unicode

Total characters: 1029553
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: United States
2nd row: Canada
3rd row: France
4th row: Canada
5th row: United States

Value | Count | Frequency (%)
united | 24005 | 15.6%
states | 22195 | 14.4%
australia | 6370 | 4.1%
canada | 5424 | 3.5%
new | 4826 | 3.1%
brazil | 4504 | 2.9%
guinea | 4263 | 2.8%
papua | 4081 | 2.7%
of | 3713 | 2.4%
republic | 3615 | 2.4%
Other values (285) | 70758 | 46.0%

Most occurring characters

Value | Count | Frequency (%)
a | 140744 | 13.7%
e | 95326 | 9.3%
t | 90398 | 8.8%
i | 87508 | 8.5%
n | 81067 | 7.9%
(space) | 55135 | 5.4%
s | 47271 | 4.6%
d | 45809 | 4.4%
r | 35957 | 3.5%
l | 33136 | 3.2%
Other values (50) | 317202 | 30.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1029553 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1029553 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1029553 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Airport Continent
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB
NAM: 32033
AS: 18637
OC: 13866
EU: 12335
AF: 11030

Length

Max length: 3
Median length: 2
Mean length: 2.4334966
Min length: 2

Characters and Unicode

Total characters: 239989
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NAM
2nd row: NAM
3rd row: EU
4th row: NAM
5th row: NAM

Common Values

Value | Count | Frequency (%)
NAM | 32033 | 32.5%
AS | 18637 | 18.9%
OC | 13866 | 14.1%
EU | 12335 | 12.5%
AF | 11030 | 11.2%
SAM | 10718 | 10.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
nam | 32033 | 32.5%
as | 18637 | 18.9%
oc | 13866 | 14.1%
eu | 12335 | 12.5%
af | 11030 | 11.2%
sam | 10718 | 10.9%

Most occurring characters

Value | Count | Frequency (%)
A | 72418 | 30.2%
M | 42751 | 17.8%
N | 32033 | 13.3%
S | 29355 | 12.2%
O | 13866 | 5.8%
C | 13866 | 5.8%
E | 12335 | 5.1%
U | 12335 | 5.1%
F | 11030 | 4.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 239989 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 239989 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 239989 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Continents
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB
North America: 32033
Asia: 18637
Oceania: 13866
Europe: 12335
Africa: 11030

Length

Max length: 13
Median length: 7
Mean length: 8.7971182
Min length: 4

Characters and Unicode

Total characters: 867563
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: North America
2nd row: North America
3rd row: Europe
4th row: North America
5th row: North America

Common Values

Value | Count | Frequency (%)
North America | 32033 | 32.5%
Asia | 18637 | 18.9%
Oceania | 13866 | 14.1%
Europe | 12335 | 12.5%
Africa | 11030 | 11.2%
South America | 10718 | 10.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
america | 42751 | 30.2%
north | 32033 | 22.7%
asia | 18637 | 13.2%
oceania | 13866 | 9.8%
europe | 12335 | 8.7%
africa | 11030 | 7.8%
south | 10718 | 7.6%

Most occurring characters

Value | Count | Frequency (%)
a | 100150 | 11.5%
r | 98149 | 11.3%
i | 86284 | 9.9%
A | 72418 | 8.3%
e | 68952 | 7.9%
c | 67647 | 7.8%
o | 55086 | 6.3%
t | 42751 | 4.9%
h | 42751 | 4.9%
(space) | 42751 | 4.9%
Other values (10) | 190624 | 22.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 867563 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 867563 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 867563 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Departure Date
DateTime

Distinct: 364
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2022-12-30 00:00:00
Histogram with fixed size bins (bins=50)
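
The sample rows at the end of this report show Departure Date as `M/D/YYYY` strings, while the summary above reports ISO-style timestamps. A minimal sketch of that parsing with the standard library follows; it also checks that the reported minimum and maximum span 364 calendar days, which matches the distinct count. The format string and the helper name `parse_departure` are assumptions inferred from the sample rows.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # assumed from sample values such as "6/28/2022"

def parse_departure(text: str) -> datetime:
    """Parse an M/D/YYYY string into a datetime at midnight."""
    return datetime.strptime(text, DATE_FORMAT)

# Values copied from the first three sample rows of the report.
parsed = [parse_departure(s) for s in ["6/28/2022", "12/26/2022", "1/18/2022"]]
print(parsed[0].isoformat(sep=" "))

# Inclusive number of calendar days between the reported minimum and maximum.
first, last = datetime(2022, 1, 1), datetime(2022, 12, 30)
days_covered = (last - first).days + 1
print(days_covered)
```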

Arrival Airport
Text

Distinct: 9024
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.9821333
Min length: 1

Characters and Unicode

Total characters: 294095
Distinct characters: 31
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: CXF
2nd row: YCO
3rd row: GNB
4th row: YND
5th row: SEE

Value | Count | Frequency (%)
0 | 873 | 0.9%
jnb | 37 | < 0.1%
phm | 36 | < 0.1%
mpt | 32 | < 0.1%
pco | 27 | < 0.1%
yty | 27 | < 0.1%
zrz | 26 | < 0.1%
dzi | 25 | < 0.1%
aht | 25 | < 0.1%
gtf | 25 | < 0.1%
Other values (9014) | 97486 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
A | 16351 | 5.6%
S | 15989 | 5.4%
B | 15328 | 5.2%
M | 15006 | 5.1%
L | 14420 | 4.9%
C | 13922 | 4.7%
T | 13749 | 4.7%
K | 13118 | 4.5%
N | 12687 | 4.3%
R | 12683 | 4.3%
Other values (21) | 150842 | 51.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 294095 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 294095 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 294095 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Pilot Name
Text

Distinct: 98610
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB

Length

Max length: 30
Median length: 27
Mean length: 13.983827
Min length: 6

Characters and Unicode

Total characters: 1379071
Distinct characters: 65
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 98601
Unique (%): > 99.9%

Sample

1st row: Edithe Leggis
2nd row: Elwood Catt
3rd row: Darby Felgate
4th row: Dominica Pyle
5th row: Bay Pencost

Value | Count | Frequency (%)
de | 478 | 0.2%
van | 166 | 0.1%
le | 161 | 0.1%
di | 98 | < 0.1%
o | 90 | < 0.1%
der | 59 | < 0.1%
la | 56 | < 0.1%
von | 53 | < 0.1%
mc | 50 | < 0.1%
brett | 39 | < 0.1%
Other values (47893) | 197350 | 99.4%

Most occurring characters

Value | Count | Frequency (%)
e | 143930 | 10.4%
a | 116939 | 8.5%
(space) | 99981 | 7.2%
n | 95227 | 6.9%
r | 92901 | 6.7%
i | 92217 | 6.7%
l | 84414 | 6.1%
o | 73020 | 5.3%
t | 55467 | 4.0%
s | 46770 | 3.4%
Other values (55) | 478205 | 34.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1379071 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1379071 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1379071 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Flight Status
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 770.6 KiB
Cancelled: 32942
On Time: 32846
Delayed: 32831

Length

Max length: 9
Median length: 7
Mean length: 7.668066
Min length: 7

Characters and Unicode

Total characters: 756217
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: On Time
2nd row: On Time
3rd row: On Time
4th row: Delayed
5th row: On Time

Common Values

Value | Count | Frequency (%)
Cancelled | 32942 | 33.4%
On Time | 32846 | 33.3%
Delayed | 32831 | 33.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
cancelled | 32942 | 25.1%
on | 32846 | 25.0%
time | 32846 | 25.0%
delayed | 32831 | 25.0%

Most occurring characters

Value | Count | Frequency (%)
e | 164392 | 21.7%
l | 98715 | 13.1%
n | 65788 | 8.7%
a | 65773 | 8.7%
d | 65773 | 8.7%
C | 32942 | 4.4%
c | 32942 | 4.4%
O | 32846 | 4.3%
(space) | 32846 | 4.3%
T | 32846 | 4.3%
Other values (4) | 131354 | 17.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 756217 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 756217 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 756217 | 100.0%

Most frequent character per category, script, and block: identical to the "Most occurring characters" table, since only the (unknown) group occurs.

Interactions


Correlations

Variable | Age | Airport Continent | Continents | Flight Status | Gender
Age | 1.000 | 0.000 | 0.000 | 0.010 | 0.002
Airport Continent | 0.000 | 1.000 | 1.000 | 0.000 | 0.006
Continents | 0.000 | 1.000 | 1.000 | 0.000 | 0.006
Flight Status | 0.010 | 0.000 | 0.000 | 1.000 | 0.000
Gender | 0.002 | 0.006 | 0.006 | 0.000 | 1.000
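
A coefficient of 1.000 between Airport Continent and Continents is what one would expect when one column is a relabelling of the other (NAM for North America, and so on). The sketch below computes an uncorrected Cramér's V from a contingency table built from the category counts reported earlier, assuming a strict one-to-one mapping between the two columns. Both the choice of Cramér's V and the one-to-one mapping are assumptions; the report does not state which association measure it uses.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))

# Counts for NAM, AS, OC, EU, AF, SAM as reported for both continent columns.
counts = [32033, 18637, 13866, 12335, 11030, 10718]
# Assumed one-to-one mapping: all mass sits on the diagonal.
table = [[c if i == j else 0 for j in range(len(counts))]
         for i, c in enumerate(counts)]

v = cramers_v(table)
print(f"{v:.3f}")
```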

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row | Passenger ID | First Name | Last Name | Gender | Age | Nationality | Airport Name | Airport Country Code | Country Name | Airport Continent | Continents | Departure Date | Arrival Airport | Pilot Name | Flight Status
0 | ABVWIg | Edithe | Leggis | Female | 62 | Japan | Coldfoot Airport | US | United States | NAM | North America | 6/28/2022 | CXF | Edithe Leggis | On Time
1 | jkXXAX | Elwood | Catt | Male | 62 | Nicaragua | Kugluktuk Airport | CA | Canada | NAM | North America | 12/26/2022 | YCO | Elwood Catt | On Time
2 | CdUz2g | Darby | Felgate | Male | 67 | Russia | Grenoble-Isère Airport | FR | France | EU | Europe | 1/18/2022 | GNB | Darby Felgate | On Time
3 | BRS38V | Dominica | Pyle | Female | 71 | China | Ottawa / Gatineau Airport | CA | Canada | NAM | North America | 9/16/2022 | YND | Dominica Pyle | Delayed
4 | 9kvTLo | Bay | Pencost | Male | 21 | China | Gillespie Field | US | United States | NAM | North America | 2/25/2022 | SEE | Bay Pencost | On Time
5 | nMJKVh | Lora | Durbann | Female | 55 | Brazil | Coronel Horácio de Mattos Airport | BR | Brazil | SAM | South America | 6/10/2022 | LEC | Lora Durbann | On Time
6 | 8IPFPE | Rand | Bram | Male | 73 | Ivory Coast | Duxford Aerodrome | GB | United Kingdom | EU | Europe | 10/30/2022 | QFO | Rand Bram | Cancelled
7 | pqixbY | Perceval | Dallosso | Male | 36 | Vietnam | Maestro Wilson Fonseca Airport | BR | Brazil | SAM | South America | 4/7/2022 | STM | Perceval Dallosso | Cancelled
8 | QNAs2R | Aleda | Pigram | Female | 35 | Palestinian Territory | Venice Marco Polo Airport | IT | Italy | EU | Europe | 8/20/2022 | VCE | Aleda Pigram | On Time
9 | 3jmudz | Burlie | Schustl | Male | 13 | Thailand | Vermilion Airport | CA | Canada | NAM | North America | 4/6/2022 | YVG | Burlie Schustl | On Time

Row | Passenger ID | First Name | Last Name | Gender | Age | Nationality | Airport Name | Airport Country Code | Country Name | Airport Continent | Continents | Departure Date | Arrival Airport | Pilot Name | Flight Status
98609 | fzGKSb | Olimpia | Arstall | Female | 22 | China | Wuzhou Changzhoudao Airport | CN | China | AS | Asia | 4/23/2022 | WUZ | Olimpia Arstall | On Time
98610 | Wahnk2 | Che | Pressland | Male | 83 | France | Warangal Airport | IN | India | AS | Asia | 8/5/2022 | WGC | Che Pressland | Cancelled
98611 | 0mBUjN | Hadria | Vacher | Female | 41 | Canada | Ipil Airport | PH | Philippines | AS | Asia | 6/6/2022 | IPE | Hadria Vacher | On Time
98612 | Hm8PVQ | Ody | Tineman | Male | 82 | Indonesia | Five Mile Airport | US | United States | NAM | North America | 3/17/2022 | FMC | Ody Tineman | Delayed
98613 | XqX0PI | Oneida | Ossipenko | Female | 47 | Serbia | Arugam Bay SPB | LK | Sri Lanka | AS | Asia | 5/12/2022 | AYY | Oneida Ossipenko | Delayed
98614 | hnGQ62 | Gareth | Mugford | Male | 85 | China | Hasvik Airport | NO | Norway | EU | Europe | 12/11/2022 | HAA | Gareth Mugford | Cancelled
98615 | 2omEzh | Kasey | Benedict | Female | 19 | Russia | Ampampamena Airport | MG | Madagascar | AF | Africa | 10/30/2022 | IVA | Kasey Benedict | Cancelled
98616 | VUPiVG | Darrin | Lucken | Male | 65 | Indonesia | Albacete-Los Llanos Airport | ES | Spain | EU | Europe | 9/10/2022 | ABC | Darrin Lucken | On Time
98617 | E47NtS | Gayle | Lievesley | Female | 34 | China | Gagnoa Airport | CI | Côte d'Ivoire | AF | Africa | 10/26/2022 | GGN | Gayle Lievesley | Cancelled
98618 | 8JYEcz | Wilhelmine | Touret | Female | 10 | Poland | Yoshkar-Ola Airport | RU | Russian Federation | EU | Europe | 4/16/2022 | JOK | Wilhelmine Touret | Delayed